18 research outputs found

    Revisiting Taxonomy Induction over Wikipedia

    Guided by multiple heuristics, a unified taxonomy of entities and categories is distilled from the Wikipedia category network. A comprehensive evaluation, based on the analysis of upward generalization paths, demonstrates that the taxonomy supports generalizations which are more than twice as accurate as the state of the art. The taxonomy is available at http://headstaxonomy.com

    GIANT: Scalable Creation of a Web-scale Ontology

    Understanding what online users may pay attention to is key to content recommendation and search services. These services will benefit from a highly structured and web-scale ontology of entities, concepts, events, topics and categories. While existing knowledge bases and taxonomies embody a large volume of entities and categories, we argue that they fail to discover properly grained concepts, events and topics in the language style of the online population. Neither is a logically structured ontology maintained among these notions. In this paper, we present GIANT, a mechanism to construct a user-centered, web-scale, structured ontology, containing a large number of natural language phrases conforming to user attention at various granularities, mined from a vast volume of web documents and search click graphs. Various types of edges are also constructed to maintain a hierarchy in the ontology. We present the graph-neural-network-based techniques used in GIANT, and evaluate the proposed methods against a variety of baselines. GIANT has produced the Attention Ontology, which has been deployed in various Tencent applications involving over a billion users. Online A/B testing performed on Tencent QQ Browser shows that Attention Ontology can significantly improve click-through rates in news recommendation. Comment: Accepted as full paper by SIGMOD 202

    Modeling, Indexing and Retrieving Images using Conceptual Graphs

    When dealing with the complexity of an image as part of the indexing process, keywords are not sufficient to obtain an index that is a faithful representation of the image content. We propose to use the conceptual graphs formalism as the indexing language, which allows the use of not only keywords but also the relations between them. The obtained indexes are more precise, and retrieval effectiveness is thus improved. Our paper presents a system that provides a computer-assisted image indexing process, which is performed according to a formal image model. The result of the indexing process, which is a set of conceptual graphs, is then organized so as to improve retrieval execution times. Our image retrieval system, called RELIEF, is implemented on an object-oriented DBMS and is available on the Web. It ensures the management of an image test collection and gives good results, with respect to both execution time and quality of answers. 1 Introduction: Towards Precision-Oriented ..

    Finding the Best Parameters for Image Ranking: a User-Oriented Approach

    Image ranking is a task that involves different parameters. They depend on the intrinsic characteristics of an image, but also on the indexing language used for representing its semantic content. We developed a weighting model that combines these parameters in a general scheme. Finding the best balance between the parameters is not straightforward. Different parameter combinations lead to different rankings, which may be more or less accepted by the users. In this paper, we choose a set of test queries and present the impact of the parameters on the rank of each image. Different combinations are discussed, and the best combination is specified. For the evaluation, we follow a user-oriented approach, and compare the ranking provided by each parameter combination to the ranking given by human judgment. This is a step toward a user-centered image retrieval system, which will dynamically adapt to the user's profile and preferences. 1 Introduction Images constitute a complex type of medi..

    RELIEF: Combining expressiveness and rapidity into a single system

    This paper constitutes a proposal for an efficient and effective logical information retrieval system. Following a relational indexing approach, which is in our opinion a necessity to cope with emerging applications such as those based on multimedia, we use the conceptual graphs formalism as our indexing language. This choice allows for relational indexing support and captures all the useful properties of the logical information retrieval model, in a workable system. First order logic and standard information retrieval techniques are combined to the same effect: obtaining an expressive system, able to accurately handle complex documents, improve retrieval effectiveness, and achieve good time performance. Experiments on an image test collection, within a system available on the Web, provide an illustration of the role that logic may have in the future development of information retrieval systems. 1 Introduction The emergence of new applications, such as those based ..

    The RELIEF Retrieval System

    We introduce in this paper the RELIEF retrieval system, a system for image retrieval based on the conceptual graph formalism. Conceptual graphs can be used as a simple and expressive language for indexing and retrieving non-textual documents. In this formalism the implementation of the matching function between a query and a document is obtained by using the so-called projection operator between two conceptual graphs. However, the first implementations of this operator have shown its lack of efficiency, as it is based on an exponential algorithm. The RELIEF system supports a new polynomial matching function for conceptual graphs which turns out to be equivalent to the projection operator, with the important additional feature that it incorporates reasoning about relations. This allows for better qualitative results as well as improved time performance compared to the original system implementing the classical projection. RELIEF is developed on top of the object oriented DBMS O2, and any ..

    High Performance Question/Answering

    In this paper we present the features of a Question/Answering (Q/A) system that had unparalleled performance in the TREC-9 evaluations. We explain the accuracy of our system through the unique characteristics of its architecture: (1) usage of a wide-coverage answer type taxonomy; (2) repeated passage retrieval; (3) lexico-semantic feedback loops; (4) extraction of the answers based on machine learning techniques; and (5) answer caching. Experimental results show the effects of each feature on the overall performance of the Q/A system and lead to general conclusions about Q/A from large text collections.